Interrupted Time Series Analysis

Overview

This analysis examines two major interventions that reshaped the sports betting landscape: the COVID-19 pandemic (March 2020) and New York’s online sports betting legalization (January 2022). Using Google Trends data spanning 2020-2024, we employ interrupted time series regression to quantify both immediate shocks and sustained trend changes. The ITS model

\(Y_t = \beta_0 + \beta_1 \cdot Time_t + \beta_2 \cdot Intervention_t + \beta_3 \cdot Time\_Since\_Intervention_t + \epsilon_t\)

isolates intervention effects from underlying trends, measuring level changes (\(\beta_2\)) and slope changes (\(\beta_3\)) in search interest. Google Trends provides a direct cultural proxy for public engagement, capturing how these events transformed sports betting from a niche activity during COVID’s sports shutdown to mainstream entertainment following New York’s market entry.
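
As a rough illustration of how the three predictors work together, the sketch below fits the same regression to a simulated series. Everything in it (the series length, the break at t = 60, the true coefficients and the noise level) is made up for the example and is not drawn from the Google Trends data.

```r
# Simulated ITS example; all numbers here are illustrative assumptions.
set.seed(42)

n <- 100                            # number of time points
t0 <- 60                            # first post-intervention time point
time <- 1:n                         # Time_t
post <- as.numeric(time >= t0)      # Intervention_t (0 before, 1 from t0 on)
time_since <- pmax(time - t0, 0)    # Time_Since_Intervention_t (0 up to and at t0)

# True process: baseline 20, slope 0.2, level jump 5, slope change 0.3
y <- 20 + 0.2 * time + 5 * post + 0.3 * time_since + rnorm(n, sd = 1)

# Same specification as the ITS model above
fit <- lm(y ~ time + post + time_since)
est <- coef(fit)
print(round(est, 3))
```

The coefficient on `post` should land near the simulated level jump and the coefficient on `time_since` near the simulated slope change, which is how \(\beta_2\) and \(\beta_3\) are read in the analyses that follow.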


Data & Setup

Code
library(tidyverse)
library(lubridate)
library(forecast)
library(gridExtra)
library(kableExtra)

theme_set(theme_minimal(base_size = 12))

# Load Google Trends data
trends <- read_csv("data/google_trends/sports_betting_trends.csv", show_col_types = FALSE) %>%
    mutate(date = as.Date(date))

# Create separate dataframes and add ITS variables
intervention_date <- as.Date("2022-01-08")
covid_intervention <- as.Date("2020-03-11")

add_its_variables <- function(df, int_date) {
    df %>%
        mutate(
            Post_Intervention = ifelse(date >= int_date, 1, 0),
            Time = as.numeric(date - min(date)),
            Time_Since_Intervention = ifelse(Post_Intervention == 1,
                as.numeric(date - int_date), 0
            )
        )
}

sports_betting <- trends %>%
    filter(keyword == "sports betting") %>%
    add_its_variables(intervention_date)
draftkings <- trends %>%
    filter(keyword == "draftkings") %>%
    add_its_variables(intervention_date)
online_betting <- trends %>%
    filter(keyword == "online betting") %>%
    add_its_variables(intervention_date)
nba_betting <- trends %>%
    filter(keyword == "nba betting") %>%
    add_its_variables(intervention_date)

Background & Data

Intervention: New York online sports betting launch (January 8, 2022), the largest U.S. sports betting market by revenue, representing ~20% of the national market.

Research Question: Did NY legalization produce a significant change in public interest in sports betting? We analyze Google search trends as a direct measure of cultural impact and consumer awareness, examining four search terms that capture different facets of betting interest from general (“sports betting”) to operator-specific (“draftkings”) to sport-specific (“nba betting”).

Visualization & Data Preparation

Code
# Visualization
p1 <- ggplot(sports_betting, aes(x = date, y = hits)) +
    geom_line(color = "#1e88e5", linewidth = 1) +
    geom_vline(xintercept = intervention_date, color = "red", linetype = "dashed", linewidth = 1.2) +
    annotate("text",
        x = intervention_date, y = max(sports_betting$hits) * 0.95,
        label = "NY Launch\nJan 8, 2022", color = "red", size = 4, fontface = "bold"
    ) +
    labs(
        title = "Google Searches: 'sports betting'",
        subtitle = "NY sports betting legalization intervention point marked in red",
        x = "Date", y = "Search Interest (0-100)"
    ) +
    theme(plot.title = element_text(face = "bold"))

p2 <- ggplot(draftkings, aes(x = date, y = hits)) +
    geom_line(color = "#ff6f00", linewidth = 1) +
    geom_vline(xintercept = intervention_date, color = "red", linetype = "dashed", linewidth = 1.2) +
    annotate("text",
        x = intervention_date, y = max(draftkings$hits) * 0.95,
        label = "NY Launch\nJan 8, 2022", color = "red", size = 4, fontface = "bold"
    ) +
    labs(title = "Google Searches: 'draftkings'", x = "Date", y = "Search Interest (0-100)") +
    theme(plot.title = element_text(face = "bold"))

p3 <- ggplot(online_betting, aes(x = date, y = hits)) +
    geom_line(color = "#43a047", linewidth = 1) +
    geom_vline(xintercept = intervention_date, color = "red", linetype = "dashed", linewidth = 1.2) +
    annotate("text",
        x = intervention_date, y = max(online_betting$hits) * 0.95,
        label = "NY Launch\nJan 8, 2022", color = "red", size = 4, fontface = "bold"
    ) +
    labs(title = "Google Searches: 'online betting'", x = "Date", y = "Search Interest (0-100)") +
    theme(plot.title = element_text(face = "bold"))

p4 <- ggplot(nba_betting, aes(x = date, y = hits)) +
    geom_line(color = "#8e24aa", linewidth = 1) +
    geom_vline(xintercept = intervention_date, color = "red", linetype = "dashed", linewidth = 1.2) +
    annotate("text",
        x = intervention_date, y = max(nba_betting$hits) * 0.95,
        label = "NY Launch\nJan 8, 2022", color = "red", size = 4, fontface = "bold"
    ) +
    labs(title = "Google Searches: 'nba betting'", x = "Date", y = "Search Interest (0-100)") +
    theme(plot.title = element_text(face = "bold"))

gridExtra::grid.arrange(p1, p2, p3, p4, ncol = 2)

Code
# Data preparation table
prep_table <- sports_betting %>%
    filter(date >= as.Date("2021-12-01") & date <= as.Date("2022-02-28")) %>%
    select(date, hits, Time, Post_Intervention, Time_Since_Intervention) %>%
    mutate(Period = ifelse(Post_Intervention == 0, "Pre-NY", "Post-NY")) %>%
    select(date, Period, Y = hits, X_t = Time, Z_t = Post_Intervention, P_t = Time_Since_Intervention)

kable(prep_table,
    format = "html", digits = 1,
    caption = "ITS Variables for NY Legalization Analysis (Sample Period)",
    col.names = c("Date", "Period", "Y (Search Interest)", "X_t (Time)", "Z_t (NY Indicator)", "P_t (Time Since NY)")
) %>%
    kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed")) %>%
    row_spec(0, bold = TRUE, background = "#2c3e50", color = "white") %>%
    row_spec(which(prep_table$Z_t == 1), background = "#e3f2fd")
ITS Variables for NY Legalization Analysis (Sample Period)
Date Period Y (Search Interest) X_t (Time) Z_t (NY Indicator) P_t (Time Since NY)
2021-12-05 Pre-NY 35 707 0 0
2021-12-12 Pre-NY 35 714 0 0
2021-12-19 Pre-NY 33 721 0 0
2021-12-26 Pre-NY 36 728 0 0
2022-01-02 Pre-NY 43 735 0 0
2022-01-09 Post-NY 46 742 1 1
2022-01-16 Post-NY 40 749 1 8
2022-01-23 Post-NY 37 756 1 15
2022-01-30 Post-NY 35 763 1 22
2022-02-06 Post-NY 40 770 1 29
2022-02-13 Post-NY 45 777 1 36
2022-02-20 Post-NY 23 784 1 43
2022-02-27 Post-NY 24 791 1 50

Variables: Y = Search interest; X_t = Time (days since the first observation; the data are weekly, so it advances by 7 per row); Z_t = NY intervention indicator (0 before, 1 on or after Jan 8, 2022); P_t = Days since NY launch (0 before intervention, increases after).

ITS Model & Results

Model: \(Y_t = \beta_0 + \beta_1 X_t + \beta_2 Z_t + \beta_3 P_t + \varepsilon_t\) where \(\beta_1\) captures pre-intervention trend, \(\beta_2\) measures immediate level change, and \(\beta_3\) quantifies trend change after intervention.

Code
# Fit models for all search terms
model_sb <- lm(hits ~ Time + Post_Intervention + Time_Since_Intervention, data = sports_betting)
model_dk <- lm(hits ~ Time + Post_Intervention + Time_Since_Intervention, data = draftkings)
model_ob <- lm(hits ~ Time + Post_Intervention + Time_Since_Intervention, data = online_betting)
model_nb <- lm(hits ~ Time + Post_Intervention + Time_Since_Intervention, data = nba_betting)

# Extract coefficients
extract_results <- function(model, name) {
    coef <- summary(model)$coefficients
    data.frame(
        Term = name,
        Parameter = c("Intercept", "Pre-Trend (β₁)", "Level Change (β₂)", "Trend Change (β₃)"),
        Coefficient = coef[, 1],
        Std_Error = coef[, 2],
        p_value = coef[, 4]
    )
}

results_sb <- extract_results(model_sb, "Sports Betting")
results_dk <- extract_results(model_dk, "DraftKings")
results_ob <- extract_results(model_ob, "Online Betting")
results_nb <- extract_results(model_nb, "NBA Betting")

all_results <- rbind(results_sb, results_dk, results_ob, results_nb)

kable(all_results,
    format = "html", digits = 4, row.names = FALSE,
    caption = "NY Legalization ITS Results: All Search Terms",
    col.names = c("Term", "Parameter", "Coefficient", "Std. Error", "p-value")
) %>%
    kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed")) %>%
    row_spec(0, bold = TRUE, background = "#2c3e50", color = "white") %>%
    row_spec(which(all_results$p_value < 0.05), background = "#d4edda") %>%
    pack_rows("Sports Betting", 1, 4) %>%
    pack_rows("DraftKings", 5, 8) %>%
    pack_rows("Online Betting", 9, 12) %>%
    pack_rows("NBA Betting", 13, 16)
NY Legalization ITS Results: All Search Terms
Term Parameter Coefficient Std. Error p-value
Sports Betting
Sports Betting Intercept 14.3236 1.9043 0.0000
Sports Betting Pre-Trend (β₁) 0.0316 0.0045 0.0000
Sports Betting Level Change (β₂) -9.3448 2.4895 0.0002
Sports Betting Trend Change (β₃) -0.0255 0.0051 0.0000
DraftKings
DraftKings Intercept 29.9245 3.2708 0.0000
DraftKings Pre-Trend (β₁) 0.0370 0.0077 0.0000
DraftKings Level Change (β₂) -9.1855 4.2759 0.0326
DraftKings Trend Change (β₃) -0.0352 0.0088 0.0001
Online Betting
Online Betting Intercept 22.8907 2.6444 0.0000
Online Betting Pre-Trend (β₁) 0.0351 0.0062 0.0000
Online Betting Level Change (β₂) -16.1522 3.4570 0.0000
Online Betting Trend Change (β₃) -0.0240 0.0071 0.0009
NBA Betting
NBA Betting Intercept 29.7822 4.2777 0.0000
NBA Betting Pre-Trend (β₁) 0.0358 0.0101 0.0004
NBA Betting Level Change (β₂) -1.0184 5.5922 0.8556
NBA Betting Trend Change (β₃) -0.0484 0.0115 0.0000
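
Because Time and Time_Since_Intervention are computed as differences between dates, the slope coefficients in this table are in index points per day. The short sketch below converts the rounded “sports betting” estimates to per-week values, which may be easier to compare against weekly data; the variable names are illustrative only.

```r
# Rounded "sports betting" estimates copied from the results table
beta1_per_day <- 0.0316    # pre-launch slope (points per day)
beta3_per_day <- -0.0255   # change in slope after launch (points per day)

# Weekly observations are 7 days apart, so multiply daily slopes by 7
pre_slope_week <- beta1_per_day * 7
post_slope_week <- (beta1_per_day + beta3_per_day) * 7

cat(sprintf("Pre-launch slope:  %.4f points/week\n", pre_slope_week))
cat(sprintf("Post-launch slope: %.4f points/week\n", post_slope_week))
```

On these rounded inputs the pre-launch slope is about 0.22 points per week and the post-launch slope about 0.04 points per week.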
Code
# Fitted values plot
sports_betting$Fitted <- fitted(model_sb)

ggplot(sports_betting, aes(x = date)) +
    geom_line(aes(y = hits), color = "gray60", alpha = 0.6, linewidth = 0.5) +
    geom_line(aes(y = Fitted), color = "#1e88e5", linewidth = 1.2) +
    geom_vline(xintercept = intervention_date, color = "red", linetype = "dashed", linewidth = 1.2) +
    annotate("text",
        x = intervention_date, y = max(sports_betting$hits) * 0.95,
        label = "NY Launch", color = "red", size = 5, fontface = "bold"
    ) +
    labs(
        title = "'Sports Betting' Searches: Actual vs. Fitted ITS Trend Lines",
        subtitle = "Gray = Actual | Blue = Fitted (shows pre- and post-intervention trends)",
        x = "Date", y = "Search Interest (0-100)"
    ) +
    theme(plot.title = element_text(face = "bold", size = 15))

Counterfactual & Interpretation

Code
# Generate predictions and counterfactuals
sports_betting <- sports_betting %>%
    mutate(
        Predicted = predict(model_sb),
        Counterfactual = predict(model_sb, newdata = sports_betting %>% mutate(Post_Intervention = 0, Time_Since_Intervention = 0)),
        Effect = Predicted - Counterfactual
    )

# Counterfactual visualization
ggplot(sports_betting, aes(x = date)) +
    geom_line(aes(y = Counterfactual, linetype = "Counterfactual (No NY)"), color = "#1e88e5", linewidth = 1.2) +
    geom_line(aes(y = Predicted, linetype = "Predicted (With NY)"), color = "#d32f2f", linewidth = 1.2) +
    geom_point(aes(y = hits, shape = "Actual Data"), color = "gray30", size = 1, alpha = 0.5) +
    geom_ribbon(
        data = sports_betting %>% filter(Post_Intervention == 1),
        aes(ymin = Predicted, ymax = Counterfactual), fill = "red", alpha = 0.15
    ) +
    geom_vline(xintercept = intervention_date, color = "red", linetype = "dashed", linewidth = 1.5) +
    annotate("text",
        x = intervention_date, y = max(sports_betting$hits) * 0.98,
        label = "NY Launch", color = "red", size = 5, fontface = "bold", hjust = -0.1
    ) +
    scale_linetype_manual(
        name = "Trend Lines",
        values = c("Counterfactual (No NY)" = "dashed", "Predicted (With NY)" = "solid")
    ) +
    scale_shape_manual(name = "Data", values = c("Actual Data" = 16)) +
    labs(
        title = "Predicted vs. Counterfactual: Impact of NY Sports Betting Legalization",
        subtitle = "Red shaded area shows intervention effect (difference between what happened and what would have happened)",
        x = "Date", y = "Search Interest (0-100)",
        caption = "Blue dashed = Counterfactual (pre-NY trend continued) | Red solid = Predicted (with NY intervention)"
    ) +
    theme(plot.title = element_text(face = "bold", size = 16), legend.position = "bottom")

Code
# Coefficient interpretation
coefs <- coef(model_sb)
coef_summary <- summary(model_sb)$coefficients

cat("=== COEFFICIENT INTERPRETATION ===\n\n")
=== COEFFICIENT INTERPRETATION ===
Code
cat(sprintf(
    "β₁ (Pre-Trend) = %.4f (p = %.4f): Search interest %s by %.4f points/day before NY launch %s\n\n",
    coefs[2], coef_summary[2, 4],
    ifelse(coefs[2] > 0, "increased", "decreased"),
    abs(coefs[2]),
    ifelse(coef_summary[2, 4] < 0.05, "(significant)", "(not significant)")
))
β₁ (Pre-Trend) = 0.0316 (p = 0.0000): Search interest increased by 0.0316 points/day before NY launch (significant)
Code
cat(sprintf(
    "β₂ (Level Change) = %.4f (p = %.4f): %s immediate %s of %.2f points when NY launched %s\n\n",
    coefs[3], coef_summary[3, 4],
    ifelse(coef_summary[3, 4] < 0.05, "Significant", "No significant"),
    ifelse(coefs[3] > 0, "jump", "drop"),
    abs(coefs[3]),
    ifelse(coef_summary[3, 4] < 0.05, "(intervention had immediate effect)", "(no immediate shock)")
))
β₂ (Level Change) = -9.3448 (p = 0.0002): Significant immediate drop of 9.34 points when NY launched (intervention had immediate effect)
Code
cat(sprintf(
    "β₃ (Trend Change) = %.4f (p = %.4f): Post-NY trend = %.4f (pre) + %.4f (change) = %.4f per day\n",
    coefs[4], coef_summary[4, 4], coefs[2], coefs[4], coefs[2] + coefs[4]
))
β₃ (Trend Change) = -0.0255 (p = 0.0000): Post-NY trend = 0.0316 (pre) + -0.0255 (change) = 0.0061 per day
Code
cat(sprintf(
    "   %s in growth rate after NY launch %s\n\n",
    ifelse(coefs[4] > 0, "Acceleration", "Deceleration"),
    ifelse(coef_summary[4, 4] < 0.05, "(significant sustained effect)", "(not significant)")
))
   Deceleration in growth rate after NY launch (significant sustained effect)
Code
# Delayed effect
post_ny <- sports_betting %>% filter(Post_Intervention == 1)
avg_effect <- mean(post_ny$Effect)
current_effect <- tail(sports_betting$Effect, 1)

cat(sprintf("Average Effect (Post-NY): %.2f points\n", avg_effect))
Average Effect (Post-NY): -23.21 points
Code
cat(sprintf("Current Effect: %.2f points\n", current_effect))
Current Effect: -37.05 points
Code
cat(sprintf(
    "\nCounterfactual: Without NY legalization, current search interest would be %.1f instead of %.1f\n",
    tail(sports_betting$Counterfactual, 1), tail(sports_betting$hits, 1)
))

Counterfactual: Without NY legalization, current search interest would be 72.0 instead of 34.0

The immediate effect is captured by the β₂ coefficient, which measures whether search interest jumped or declined right after January 8, 2022. For “sports betting” the estimate is -9.34 points (p = 0.0002): relative to the projected pre-launch trend, the fitted level drops at the launch instead of jumping. The blue dashed counterfactual line represents what search interest would have looked like had the pre-launch trend continued; the gap between it and the red predicted line quantifies the model’s estimate of the intervention’s overall impact at any point in time.

The sustained effect is reflected in the β₃ coefficient, which shows whether the growth trajectory changed after legalization. The estimate of -0.0255 points per day (p < 0.0001) means growth slowed from 0.0316 to 0.0061 points per day; it did not accelerate. Beyond this trajectory shift, the delayed effect appears in both the average post-intervention difference (-23.21 points) and the current gap between predicted and counterfactual (-37.05 points): the shortfall relative to the extrapolated pre-launch trend widened over time and did not fade after the initial launch period.
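
In this model the gap between the predicted and counterfactual lines has a closed form, \(Effect_t = \beta_2 + \beta_3 P_t\). The sketch below evaluates it with the rounded “sports betting” coefficients. It assumes weekly post-launch observations running from 1 to 1086 days after launch; that end point is inferred from the reported figures, not read from the data file, so small differences from the printed output are rounding.

```r
beta2 <- -9.3448   # level change (rounded, from the results table)
beta3 <- -0.0255   # slope change per day (rounded, from the results table)

# Assumed post-launch grid: weekly observations, 1 to 1086 days after launch
p_t <- seq(1, 1086, by = 7)

# Predicted minus counterfactual at each post-launch observation
effect <- beta2 + beta3 * p_t

cat(sprintf("Average effect: %.2f points\n", mean(effect)))
cat(sprintf("Latest effect:  %.2f points\n", tail(effect, 1)))
```

Under these assumptions the two values come out close to the reported -23.21 and -37.05, which suggests the headline effects are driven mostly by the slope term accumulating over roughly three years.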

Background & Variable Selection

Intervention: The WHO declared COVID-19 a pandemic on March 11, 2020, coinciding with the immediate suspension of major sports leagues (NBA, NHL, MLB). Variable: “Sports betting” Google searches, an ITS candidate because the intervention had a clear, direct causal pathway: no live sports = no sports betting.

Visualization & Data Preparation

Code
# Prepare COVID ITS data
sports_covid_its <- sports_betting %>%
    mutate(
        X_t = row_number(),
        Z_t = ifelse(date >= covid_intervention, 1, 0),
        P_t = ifelse(date >= covid_intervention, as.numeric(date - covid_intervention) / 7, 0),
        Y = hits
    )

# Visualization
sports_covid <- sports_covid_its %>%
    filter(date >= as.Date("2019-12-01") & date <= as.Date("2021-12-31"))

ggplot(sports_covid, aes(x = date, y = Y)) +
    geom_line(color = "#1e88e5", linewidth = 1.2) +
    geom_point(color = "#1e88e5", size = 1.5, alpha = 0.6) +
    geom_vline(xintercept = covid_intervention, color = "red", linetype = "dashed", linewidth = 1.5) +
    annotate("text",
        x = covid_intervention, y = max(sports_covid$Y) * 0.95,
        label = "COVID-19 Pandemic\nMarch 11, 2020\nSports Leagues Shut Down",
        color = "red", size = 5, fontface = "bold", hjust = -0.1
    ) +
    annotate("rect",
        xmin = covid_intervention, xmax = as.Date("2020-07-01"),
        ymin = -Inf, ymax = Inf, alpha = 0.1, fill = "red"
    ) +
    annotate("text",
        x = as.Date("2020-05-01"), y = max(sports_covid$Y) * 0.1,
        label = "Sports Shutdown Period", color = "darkred", size = 4, fontface = "italic"
    ) +
    labs(
        title = "Google Searches for 'Sports Betting' Around COVID-19 Pandemic",
        subtitle = "Dramatic decline when sports leagues shut down in March 2020",
        x = "Date", y = "Search Interest (0-100)",
        caption = "Red shaded area indicates major sports leagues suspended"
    ) +
    theme(plot.title = element_text(face = "bold", size = 16))

Code
# Data preparation table
covid_table <- sports_covid_its %>%
    filter(date >= as.Date("2020-02-01") & date <= as.Date("2020-05-31")) %>%
    select(date, Y, X_t, Z_t, P_t) %>%
    mutate(Period = ifelse(Z_t == 0, "Pre-COVID", "Post-COVID"), P_t = round(P_t, 1)) %>%
    select(date, Period, Y, X_t, Z_t, P_t)

kable(covid_table,
    format = "html", digits = 1,
    caption = "ITS Variables for COVID-19 Analysis (Sample Period)",
    col.names = c("Date", "Period", "Y (Search Interest)", "X_t (Time)", "Z_t (COVID Indicator)", "P_t (Weeks Since COVID)")
) %>%
    kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed")) %>%
    row_spec(0, bold = TRUE, background = "#2c3e50", color = "white") %>%
    row_spec(which(covid_table$Z_t == 1), background = "#ffebee") %>%
    column_spec(5, bold = TRUE, color = ifelse(covid_table$Z_t == 1, "red", "black"))
ITS Variables for COVID-19 Analysis (Sample Period)
Date Period Y (Search Interest) X_t (Time) Z_t (COVID Indicator) P_t (Weeks Since COVID)
2020-02-02 Pre-COVID 29 6 0 0.0
2020-02-09 Pre-COVID 17 7 0 0.0
2020-02-16 Pre-COVID 16 8 0 0.0
2020-02-23 Pre-COVID 17 9 0 0.0
2020-03-01 Pre-COVID 18 10 0 0.0
2020-03-08 Pre-COVID 16 11 0 0.0
2020-03-15 Post-COVID 6 12 1 0.6
2020-03-22 Post-COVID 6 13 1 1.6
2020-03-29 Post-COVID 6 14 1 2.6
2020-04-05 Post-COVID 5 15 1 3.6
2020-04-12 Post-COVID 6 16 1 4.6
2020-04-19 Post-COVID 7 17 1 5.6
2020-04-26 Post-COVID 7 18 1 6.6
2020-05-03 Post-COVID 8 19 1 7.6
2020-05-10 Post-COVID 8 20 1 8.6
2020-05-17 Post-COVID 8 21 1 9.6
2020-05-24 Post-COVID 10 22 1 10.6
2020-05-31 Post-COVID 10 23 1 11.6

Variables: Y = Search interest; X_t = Time index (weeks from dataset start); Z_t = COVID indicator (0 before March 11, 2020; 1 after); P_t = Weeks since COVID intervention (0 before, increases after).

ITS Model & Results

Code
# Fit COVID ITS model
covid_model <- lm(Y ~ X_t + Z_t + P_t, data = sports_covid_its)

covid_coef <- summary(covid_model)$coefficients

covid_results <- data.frame(
    Parameter = c(
        "β₀: Intercept (Baseline)", "β₁: Pre-COVID Trend (X_t)",
        "β₂: Immediate Level Change (Z_t)", "β₃: Trend Change After COVID (P_t)"
    ),
    Coefficient = covid_coef[, 1],
    Std_Error = covid_coef[, 2],
    t_value = covid_coef[, 3],
    p_value = covid_coef[, 4],
    Significance = ifelse(covid_coef[, 4] < 0.001, "***",
        ifelse(covid_coef[, 4] < 0.01, "**",
            ifelse(covid_coef[, 4] < 0.05, "*", "")
        )
    )
)

kable(covid_results,
    format = "html", digits = 4, row.names = FALSE,
    caption = "COVID-19 ITS Model Results: Sports Betting Search Interest",
    col.names = c("Parameter", "Coefficient", "Std. Error", "t-value", "p-value", "Sig.")
) %>%
    kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed")) %>%
    row_spec(0, bold = TRUE, background = "#c62828", color = "white") %>%
    row_spec(which(covid_results$p_value < 0.05), background = "#ffcdd2")
COVID-19 ITS Model Results: Sports Betting Search Interest
Parameter Coefficient Std. Error t-value p-value Sig.
β₀: Intercept (Baseline) 28.8000 6.7224 4.2842 0.0000 ***
β₁: Pre-COVID Trend (X_t) -1.2091 0.9912 -1.2199 0.2236
β₂: Immediate Level Change (Z_t) 7.8733 6.3639 1.2372 0.2171
β₃: Trend Change After COVID (P_t) 1.2636 0.9912 1.2749 0.2035
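
One possible reason for the wide standard errors is that X_t and P_t are almost the same variable when the pre-intervention window is short. The sketch below rebuilds an approximate version of the design and checks their correlation. It assumes 262 weekly rows, with the first post-COVID row at X_t = 12 and P_t starting at 0.6 weeks; these values are inferred from the sample table and the weekly spacing, not read from the data file.

```r
n_rows <- 262                                  # assumed number of weekly observations
x_t <- seq_len(n_rows)                         # time index, as produced by row_number()
z_t <- as.numeric(x_t >= 12)                   # first post-COVID row is X_t = 12 in the table
p_t <- ifelse(z_t == 1, (x_t - 12) + 0.6, 0)   # approximate weeks since March 11, 2020

n_pre <- sum(z_t == 0)      # number of pre-intervention observations
r_xp <- cor(x_t, p_t)       # correlation between the two slope predictors

cat(sprintf("Pre-intervention rows: %d\n", n_pre))
cat(sprintf("cor(X_t, P_t) = %.4f\n", r_xp))
```

With only 11 pre-intervention rows and a correlation this close to 1, the regression has little information to separate the pre-COVID slope from the slope change, which is consistent with the identical standard errors (0.9912) reported for X_t and P_t.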
Code
# Fitted values visualization
sports_covid_its$Fitted <- fitted(covid_model)

ggplot(sports_covid_its, aes(x = date)) +
    geom_line(aes(y = Y), color = "gray50", linewidth = 0.8, alpha = 0.7) +
    geom_point(aes(y = Y), color = "gray50", size = 1, alpha = 0.5) +
    geom_line(
        data = sports_covid_its %>% filter(Z_t == 0), aes(y = Fitted),
        color = "#1e88e5", linewidth = 2
    ) +
    geom_line(
        data = sports_covid_its %>% filter(Z_t == 1), aes(y = Fitted),
        color = "#d32f2f", linewidth = 2
    ) +
    geom_vline(xintercept = covid_intervention, color = "red", linetype = "dashed", linewidth = 1.5) +
    annotate("text",
        x = covid_intervention, y = max(sports_covid_its$Y) * 0.95,
        label = "COVID-19\nMarch 11, 2020", color = "red", size = 5, fontface = "bold", hjust = -0.1
    ) +
    labs(
        title = "ITS Model: Actual Data vs. Fitted Trend Lines",
        subtitle = "Gray = Actual | Blue = Pre-COVID trend | Red = Post-COVID trend",
        x = "Date", y = "Search Interest (0-100)",
        caption = "Model: Y = β₀ + β₁(Time) + β₂(COVID Indicator) + β₃(Weeks Since COVID)"
    ) +
    theme(plot.title = element_text(face = "bold", size = 16))

Counterfactual & Interpretation

Code
# Generate predictions and counterfactuals
sports_covid_its <- sports_covid_its %>%
    mutate(
        Predicted = predict(covid_model),
        Counterfactual = predict(covid_model, newdata = sports_covid_its %>% mutate(Z_t = 0, P_t = 0)),
        Effect = Predicted - Counterfactual
    )

# Counterfactual visualization
ggplot(sports_covid_its, aes(x = date)) +
    geom_line(aes(y = Counterfactual, linetype = "Counterfactual (No COVID)"), color = "#1e88e5", linewidth = 1.2) +
    geom_line(aes(y = Predicted, linetype = "Predicted (With COVID)"), color = "#d32f2f", linewidth = 1.2) +
    geom_point(aes(y = Y, shape = "Actual Data"), color = "gray30", size = 1.5, alpha = 0.5) +
    geom_ribbon(
        data = sports_covid_its %>% filter(Z_t == 1),
        aes(ymin = Predicted, ymax = Counterfactual), fill = "red", alpha = 0.15
    ) +
    geom_vline(xintercept = covid_intervention, color = "red", linetype = "dashed", linewidth = 1.5) +
    annotate("text",
        x = covid_intervention, y = max(sports_covid_its$Y) * 0.98,
        label = "COVID-19\nIntervention", color = "red", size = 5, fontface = "bold", hjust = -0.1
    ) +
    annotate("text",
        x = as.Date("2021-06-01"), y = 40,
        label = "← Intervention Effect\n(Gap between blue and red lines)",
        color = "#c62828", size = 4.5, fontface = "bold"
    ) +
    scale_linetype_manual(
        name = "Trend Lines",
        values = c("Counterfactual (No COVID)" = "dashed", "Predicted (With COVID)" = "solid")
    ) +
    scale_shape_manual(name = "Data", values = c("Actual Data" = 16)) +
    labs(
        title = "Predicted vs. Counterfactual: Impact of COVID-19 on Sports Betting Interest",
        subtitle = "Red shaded area shows intervention effect (difference between actual and counterfactual)",
        x = "Date", y = "Search Interest (0-100)",
        caption = "Blue dashed = Counterfactual (pre-COVID trend) | Red solid = Predicted (with COVID)"
    ) +
    theme(plot.title = element_text(face = "bold", size = 16), legend.position = "bottom")

Code
# Coefficient interpretation
coefs_c <- coef(covid_model)
coef_summary_c <- summary(covid_model)$coefficients

cat("=== COEFFICIENT INTERPRETATION ===\n\n")
=== COEFFICIENT INTERPRETATION ===
Code
cat(sprintf(
    "β₁ (Pre-COVID Trend) = %.4f (p = %.4f): Search interest %s by %.4f points/week before pandemic\n\n",
    coefs_c[2], coef_summary_c[2, 4],
    ifelse(coefs_c[2] > 0, "increased", "decreased"), abs(coefs_c[2])
))
β₁ (Pre-COVID Trend) = -1.2091 (p = 0.2236): Search interest decreased by 1.2091 points/week before pandemic
Code
cat(sprintf(
    "β₂ (Immediate Shock) = %.4f (p = %.4f): Estimated level %s of %.2f points when sports stopped %s\n",
    coefs_c[3], coef_summary_c[3, 4],
    ifelse(coefs_c[3] > 0, "increase", "drop"), abs(coefs_c[3]),
    ifelse(coef_summary_c[3, 4] < 0.05, "(significant)", "(not significant)")
))
β₂ (Immediate Shock) = 7.8733 (p = 0.2171): Estimated level increase of 7.87 points when sports stopped (not significant)
Code
cat("   The sign and p-value indicate whether the model detects an immediate level shift when sports leagues shut down\n\n")
   The sign and p-value indicate whether the model detects an immediate level shift when sports leagues shut down
Code
cat(sprintf(
    "β₃ (Recovery Rate) = %.4f (p = %.4f): Post-COVID trend = %.4f (pre) + %.4f (change) = %.4f per week\n",
    coefs_c[4], coef_summary_c[4, 4], coefs_c[2], coefs_c[4], coefs_c[2] + coefs_c[4]
))
β₃ (Recovery Rate) = 1.2636 (p = 0.2035): Post-COVID trend = -1.2091 (pre) + 1.2636 (change) = 0.0546 per week
Code
cat(sprintf(
    "   Trend became %s, indicating %s as sports returned\n\n",
    ifelse(coefs_c[2] + coefs_c[4] > coefs_c[2], "MORE POSITIVE", "MORE NEGATIVE"),
    ifelse(coefs_c[4] > 0, "RECOVERY", "continued decline")
))
   Trend became MORE POSITIVE, indicating RECOVERY as sports returned
Code
# Effect analysis
post_covid_data <- sports_covid_its %>% filter(Z_t == 1)
avg_effect <- mean(post_covid_data$Effect)
min_effect <- min(post_covid_data$Effect)
current_effect <- tail(sports_covid_its$Effect, 1)

cat(sprintf("Average Effect (Post-COVID): %+.2f points relative to counterfactual\n", avg_effect))
Average Effect (Post-COVID): +166.55 points relative to counterfactual
Code
cat(sprintf("Minimum Effect: %+.2f points relative to counterfactual\n", min_effect))
Minimum Effect: +8.60 points relative to counterfactual
Code
cat(sprintf("Current Effect: %.2f points\n\n", current_effect))
Current Effect: 324.51 points
Code
cat(sprintf(
    "Counterfactual: Without COVID, current search interest would be %.1f instead of %.1f\n",
    tail(sports_covid_its$Counterfactual, 1), tail(sports_covid_its$Y, 1)
))
Counterfactual: Without COVID, current search interest would be -288.0 instead of 34.0
Code
ggplot(post_covid_data, aes(x = date, y = Effect)) +
    geom_line(color = "#d32f2f", linewidth = 1.2) +
    geom_point(color = "#d32f2f", size = 2) +
    geom_hline(yintercept = 0, linetype = "dashed", color = "gray30", linewidth = 1) +
    geom_smooth(method = "loess", se = TRUE, color = "#1e88e5", fill = "#1e88e5", alpha = 0.2) +
    labs(
        title = "Intervention Effect Over Time: COVID-19 Impact Trajectory",
        subtitle = "Negative values = Interest below counterfactual | Positive = Above counterfactual",
        x = "Date", y = "Intervention Effect (Predicted - Counterfactual)",
        caption = "Blue trend line shows recovery trajectory over time"
    ) +
    theme(plot.title = element_text(face = "bold", size = 15))

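The immediate effect is captured by β₂. The estimate is +7.87 points (p = 0.217), so this model does not detect a statistically significant level drop at the shutdown, even though the raw series falls from 16 to 6 between the weeks of March 8 and March 15, 2020. The counterfactual trajectory illustrates what search interest would have been had the pre-COVID trend continued. Here it extrapolates a slope of -1.21 points per week, estimated from only 11 pre-pandemic observations, down to -288.0 by the end of the series, far outside the 0-100 range of the index. The reported average gap of 166.55 points is positive (predicted above counterfactual) and reflects that extrapolation, not a measured amount of “lost interest.”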

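The sustained effect is reflected in β₃, indicating how the trend shifted as sports gradually returned, beginning with the NBA bubble in July 2020 and the NFL season in September 2020. The estimated change of +1.26 points per week (p = 0.204) offsets the pre-COVID slope almost exactly, leaving a post-COVID trend of 0.05 points per week, but neither slope is statistically significant. The raw data still show the expected delayed pattern of an immediate crash followed by slow recovery (from 6 in mid-March to 10 by late May 2020). The fitted gap between predicted and counterfactual lines depends too heavily on the short pre-period to show whether interest has re-aligned with its pre-pandemic path or whether lingering deficits remain.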
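
The reported counterfactual can be checked by hand, since setting Z_t and P_t to zero leaves only \(\beta_0 + \beta_1 X_t\). The sketch below uses the rounded coefficients from the COVID results table and assumes the final row has X_t = 262 (inferred from the weekly spacing, not read from the data), then flags whether the value falls inside the 0-100 range of the Google Trends index.

```r
beta0 <- 28.8000    # intercept (rounded, from the COVID results table)
beta1 <- -1.2091    # pre-COVID slope per week (rounded)

x_last <- 262       # assumed time index of the final observation

# Counterfactual: pre-COVID line extended with Z_t = 0 and P_t = 0
counterfactual_last <- beta0 + beta1 * x_last
in_range <- counterfactual_last >= 0 && counterfactual_last <= 100

cat(sprintf("Counterfactual at X_t = %d: %.1f\n", x_last, counterfactual_last))
cat(sprintf("Within 0-100 index range: %s\n", in_range))
```

A check like this is a quick way to spot when a linear counterfactual has been extended further than the pre-intervention data can support.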